We study two measures of the complexity of heterogeneous extended systems, taking random 
Boolean networks as prototypical cases. A measure defined by Shalizi et al. for cellular automata, 
based on a criterion for optimal statistical prediction [1], does not distinguish between the spatial 
inhomogeneity of the ordered phase and the dynamical inhomogeneity of the disordered phase. A 
modification in which complexities of individual nodes are calculated yields vanishing complexity 
values for networks in the ordered and critical regimes and for highly disordered networks, peaking 
somewhere in the disordered regime. Individual nodes with high complexity are the ones that pass 
the most information from the past to the future, a quantity that depends in a nontrivial way on 
both the Boolean function of a given node and its location within the network. 



I. INTRODUCTION 

In computational mechanics, the complexity of a process 
that generates a single time series is defined as the 
least amount of information required for a maximally 
accurate statistical description of the series [2-4]. This 
definition classifies random processes, as well as simple 
periodic ones, as having low complexity. In a 2004 article, 
Shalizi et al. extended the definition to spatially 
extended dynamical systems and introduced an algorithm 
for measuring the complexity of a discrete system 
given time series data for all components [1]. Applying 
the algorithm to 2D cyclic cellular automata (CCA) confirmed 
that it classified CCA rules generating fixed states 
or incoherent local oscillations as having low complexity 
and cases that produce turbulent spiral waves as having 
high complexity. 

Shalizi's complexity measure, $C_\mu$, is defined as the 
amount of information stored in local causal states, where 
a causal state is an equivalence class of all past configurations 
that give rise to the same distribution of future 
outcomes. A set of causal states can be discerned from 
time series data for all elements in the system. The local 
complexity (or "complexity density") is obtained by considering 
local spacetime regions consisting of truncated 
past and future light cones for each node in the system. 

We study two complexity measures that differ only in 
the choice of the ensembles of spacetime points used for 
averaging the local complexity. One approach considers 
the ensemble of all spatial points at the same time 
instant, which corresponds to Shalizi's $C_\mu$. In computing 
$C_\mu$, all nodes are given equal weight in determining 
the probabilities of observing different states. Shalizi 
et al. used $C_\mu$ to investigate self-organization of cellular 
automata, which are logically and topologically uniform 
networks of discrete, interacting elements. 

A second approach is to assign a complexity to each 
individual element by averaging over time. The system 
average of these individual complexities is denoted $C_\nu$. 
For systems that are statistically homogeneous in time 
and space, $C_\mu$ and $C_\nu$ are the same. We suggest that 
$C_\nu$ is the more informative of the two for spatially 
inhomogeneous systems. 



In this paper, we analyze the dependence of $C_\mu$ and $C_\nu$ 
on parameters specifying ensembles of random Boolean 
networks (RBNs) with quenched network topology and 
logic functions. RBNs were first studied by Kauffman 
as toy models of gene regulatory networks [5, 6]. They 
have garnered much attention in the last few decades, 
and features such as steady-state bias, sensitivity to 
perturbations, attractor length and mutual information 
between nodes have been extensively investigated [7-10]. 
We show here that for any given distribution of logic 
functions, the value of $C_\mu$ for an RBN can be analytically 
calculated as a function of the bias $p$ and is not simply 
related to the sensitivity $\lambda$. $C_\nu$, on the other hand, is 
always near zero for sensitivity values $\lambda \le 1$, where the 
network is in the ordered or critical regime, and is also 
zero for the highest possible $\lambda$ value, where the network 
dynamics is strongly chaotic. Thus $C_\nu$ reflects the 
intuitive notion that systems with short periodic cycles or 
apparently random behavior should both have low complexity. 
The maximum of $C_\nu$ for RBNs occurs somewhere 
in the disordered regime. We find also that the amount 
of information processed by any given individual node depends 
on global properties of the network dynamics as well as 
the logic functions of the node in question and others in 
its neighborhood. 

Section II defines the two measures $C_\mu$ and $C_\nu$ in detail. 
Section III describes an implementation of these 
definitions in the context of RBNs and presents theoretical 
and numerical results on the complexity of a certain 
class of RBNs. The relation between network complexity 
and sensitivity, and the relation between individual 
nodes' complexity and role in determining the network 
dynamics, are also discussed. We conclude with some general 
remarks and suggestions for future research. 



II. COMPLEXITY MEASURES 

The Grassberger-Crutchfield-Young statistical complexity 
is defined as the least amount of information 
about the past trajectory required for optimal prediction 
of the future trajectory, given time series data for a single 
variable [2-4]. This measure is calculated from time series 
data alone, without reference to the physical laws that 
govern the dynamical processes. Shalizi et al. extended 
the concept to processes with spatial extent [1]. We summarize 
the basic ideas here. For details and information-theoretic 
support of these ideas, see Ref. [11]. 

Given a field $X$ that varies over space and time in a system 
where information propagates at a maximum speed 
of $c$, the past light cone of a space-time point $(\vec r, t)$ consists 
of all space-time points where events can influence 
$X(\vec r, t)$. $L^-(\vec r, t)$ is the configuration of the field $X$ in the 
past light cone: 

$$L^-(\vec r, t) = \{ (X(\vec s, u),\, t-u,\, \vec r - \vec s) \;\; \forall\, u < t \text{ and } |\vec s - \vec r| \le c(t-u) \}. \qquad (1)$$

Note that each element of $L^-$ consists of three quantities: 
a field value, a time lapse, and a relative position. 
Two space-time points that are influenced by the same 
past events thus have the same $L^-$. Similarly, $L^+(\vec r, t)$ 
is the field configuration in the future light cone, the set 
of points which could be influenced by what happens at 
$(\vec r, t)$. Each space-time point is associated with one $L^-$ 
and one $L^+$. An ensemble of such pairs defines a probability 
distribution $P(L^+ \mid L^-)$ for future light cone configurations 
conditioned on past configurations. A causal state $\epsilon(l^-)$ 
is defined as a set of past light cone configurations that 
have the same distribution of future configurations. All 
instances of a given causal state predict the same distribution 
of future light cone configurations: 

$$\epsilon(l^-) = \{ \lambda : P(L^+ \mid L^- = \lambda) = P(L^+ \mid L^- = l^-) \}. \qquad (2)$$

By definition, $\epsilon(l^-)$ is a sufficient local statistic; knowing 
the causal state provides the same predictive power 
as knowing its exact past light cone configuration. $\epsilon(l^-)$ 
is also a minimal sufficient statistic [12], meaning that it 
contains the least amount of information among all 
statistics that have the same predictive power: 

$$H[\epsilon(l^-)] \le H[\eta(l^-)], \qquad (3)$$

where $\eta(l^-)$ is any sufficient statistic and 
$H[X] \equiv -\sum_i P(X = x_i) \log_2 P(X = x_i)$ denotes the Shannon entropy. 
It then follows that $H[\epsilon(l^-)]$ is the least amount 
of information for optimal prediction of the future dynamics 
[1, 11], which is taken to be the relevant measure 
of a system's complexity. We use the shorthand notation 

$$C \equiv H[\epsilon(l^-)]. \qquad (4)$$
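
As a rough illustration of the definitions above, the following sketch groups past configurations whose empirical future distributions agree and returns the entropy of the resulting partition. The function name `statistical_complexity`, the tolerance `tol`, and the greedy merging rule are assumptions made for this example; it is not the procedure of Ref. [1], which should be consulted for a statistically sound way of deciding when two distributions are equal.

```python
from collections import Counter, defaultdict
from math import log2


def statistical_complexity(pairs, tol=0.05):
    """Estimate C = H[eps(l^-)] in bits from (past, future) samples.

    Pasts whose empirical conditional future distributions differ by less
    than `tol` in total-variation distance are merged into one causal
    state. The greedy merge is an illustrative assumption.
    """
    pairs = list(pairs)
    futures_given_past = defaultdict(Counter)
    for past, future in pairs:
        futures_given_past[past][future] += 1

    def distribution(counter):
        total = sum(counter.values())
        return {f: c / total for f, c in counter.items()}

    def tv_distance(d1, d2):
        keys = set(d1) | set(d2)
        return 0.5 * sum(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) for k in keys)

    states = []  # each entry: [representative distribution, sample count]
    for counter in futures_given_past.values():
        dist, weight = distribution(counter), sum(counter.values())
        for state in states:
            if tv_distance(state[0], dist) < tol:
                state[1] += weight  # same causal state: pool the weight
                break
        else:
            states.append([dist, weight])

    n = len(pairs)
    return -sum((w / n) * log2(w / n) for _, w in states)
```

With hashable pasts and futures (for example tuples of node values), the function can be fed light-cone pairs collected either over all nodes at one time step or over all time steps at one node, which is the only difference between the two ensembles discussed in this section.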

The value of $C$ depends upon the choice of the ensemble 
of space-time points used to determine the causal 
states. We consider two choices that are equivalent for 
spatially homogeneous systems but not for RBNs or other 
inhomogeneous systems. For $C_\mu$, causal states are determined 
at any given time by considering the ensemble of 
spatial locations in the system at that time: 

$$\epsilon_\mu(l^-, t) = \{ \lambda : P(L^+(\vec r', t) \mid L^-(\vec r', t) = \lambda) = P(L^+(\vec r, t) \mid L^-(\vec r, t) = l^-) \}. \qquad (5)$$



This approach allows one to speak of the complexity of a 
system as a function of time, which may exhibit transient 
dynamics. Systems that exhibit a spontaneous increase 
in $C_\mu(t)$ have been described by Shalizi et al. as going 
through a self-organization process [1]. For present 
purposes, we use $C_\mu$ to refer to the complexity after 
transients have decayed. 

Alternatively, $C_\nu$ is based on causal states defined for 
each spatially distinct component of the system, using 
the ensemble of light cone transitions at different times: 

$$\epsilon_\nu(l^-, \vec r) = \{ \lambda : P(L^+(\vec r, t') \mid L^-(\vec r, t') = \lambda) = P(L^+(\vec r, t) \mid L^-(\vec r, t) = l^-) \}. \qquad (6)$$

$C_\nu(\vec r)$ receives negligible weight from transients, and in 
practice we compute it by taking data only after transients 
have relaxed. Because $C_\nu(\vec r)$ is associated with a 
particular element of the system, it provides information 
about the role that element plays in the dynamical evolution 
of the system. Averaging over all $\vec r$, we obtain a 
global complexity measure $C_\nu$. 

In practice, estimating the complexity requires restricting 
the depths of the light cone configurations to a manageable 
size. For the Boolean networks discussed below, 
we will show that it is sufficient to consider a single 
update step in the past and a single step in the future. 

III. COMPLEXITY OF RANDOM BOOLEAN NETWORKS 

We study $C_\mu(t)$ and $C_\nu(\vec r)$ in synchronously updated 
RBNs to determine how (or whether) they are related 
to well-understood measures of the dynamics, such as 
the sensitivity of the network or the overall bias in the 
values of the binary variables. In a synchronous RBN, at 
each time step each node $i$ is updated based on a logic 
function $f_i$ that is applied to the current values of its 
input nodes: 

$$X_i(t) = f_i(X_{i_1}(t-1), X_{i_2}(t-1), \ldots, X_{i_n}(t-1)), \qquad (7)$$

where $t$ is a positive integer. Here $f_i : \{0,1\}^n \to \{0,1\}$ 
is a quenched Boolean logic function for node $i$, chosen 
randomly from some fixed distribution over the possible 
logic functions, and the $X_{i_j}$'s are the binary values of the 
nodes that provide inputs to node $i$. 
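
A minimal sketch of the synchronous update rule for the two-input, two-output case studied below. The helper names `make_rbn` and `step`, the truth-table encoding (f(0,0), f(0,1), f(1,0), f(1,1)), and the stub-shuffling wiring are assumptions made for illustration; in particular this wiring does not exclude self-inputs or a node receiving the same input twice.

```python
import numpy as np


def make_rbn(n_nodes, truth_tables, weights, rng):
    """Random network in which every node has exactly 2 inputs and 2 outputs.

    truth_tables: sequence of 4-tuples (f(0,0), f(0,1), f(1,0), f(1,1)).
    weights: probability of assigning each truth table to a node.
    """
    # Each node appears twice among the output stubs, so its out-degree is 2.
    stubs = rng.permutation(np.repeat(np.arange(n_nodes), 2))
    inputs = stubs.reshape(n_nodes, 2)  # inputs[i] = (i_1, i_2)
    choice = rng.choice(len(truth_tables), size=n_nodes, p=weights)
    tables = np.asarray(truth_tables, dtype=np.uint8)[choice]
    return inputs, tables


def step(state, inputs, tables):
    """One synchronous update: X_i(t) = f_i(X_i1(t-1), X_i2(t-1))."""
    a = state[inputs[:, 0]]
    b = state[inputs[:, 1]]
    return tables[np.arange(len(state)), 2 * a + b]
```

A state is a `uint8` array of 0s and 1s; repeated calls to `step` generate the time series from which light cone configurations are read off.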

The number of inputs and outputs per node, which 
may or may not be the same for all nodes, and the distribution 
of logic functions are the parameters that characterize 
an ensemble of synchronous RBNs. From these 
parameters, two global measures, the bias and the sensitivity, 
can be calculated analytically [7, 8]. The bias $p$ is 
the fraction of nodes with value 1 after transients have 
decayed. Let $x \in \{0,1\}^K$ be an input vector of length 
$K$, where $K$ is the fixed number of inputs per node in the 
network. The bias map of the network is then given by 

$$p(t+1) = \sum_{x \in \{0,1\}^K} \langle f(x) \rangle \, p(t)^{|x|} (1 - p(t))^{K - |x|}, \qquad (8)$$
where $\langle \cdot \rangle$ denotes the expectation taken over the distribution of 
logic functions and $|x|$ is the number of 1s in $x$. The fixed 
point or cyclic behavior of the bias is determined by the 
bias map. The sensitivity $\lambda$ is the average rate of increase 
of the Hamming distance between two state space 
trajectories that initially differ at only a small number of 
nodes: 

$$\lambda = \left\langle \sum_{i=1}^{K} \sum_{x \in \{0,1\}^K} p^{|x|} (1-p)^{K-|x|} \, f(x^{(i,0)}) \oplus f(x^{(i,1)}) \right\rangle, \qquad (9)$$

where $x^{(i,j)} = (x_1, \ldots, x_{i-1}, j, x_{i+1}, \ldots, x_K)$ and $\oplus$ denotes 
the XOR function [8]. The sensitivity distinguishes 
qualitatively different network behaviors that have been 
termed ordered ($\lambda < 1$), disordered ($\lambda > 1$), and critical 
($\lambda = 1$) [13]. 
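
The bias map and the sensitivity can be evaluated numerically for mixtures of two-input functions. The sketch below is an illustration under stated assumptions: the function names are hypothetical, truth tables are ordered as (f(0,0), f(0,1), f(1,0), f(1,1)), the steady-state bias is found by simply iterating the map from an arbitrary starting value, and the sensitivity is taken to be the bias-weighted probability that flipping one input flips the output, summed over the inputs.

```python
from itertools import product

AND = (0, 0, 0, 1)  # (f(0,0), f(0,1), f(1,0), f(1,1))
IF = (1, 0, 1, 1)
XOR = (0, 1, 1, 0)
ON = (1, 1, 1, 1)


def mean_table(tables, weights):
    """Ensemble-averaged truth table <f(x)> for a mixture of functions."""
    return [sum(w * t[k] for t, w in zip(tables, weights)) for k in range(4)]


def bias_map(p, tables, weights):
    """One iteration of the bias map for K = 2."""
    f = mean_table(tables, weights)
    return sum(f[2 * x1 + x2] * p ** (x1 + x2) * (1 - p) ** (2 - x1 - x2)
               for x1, x2 in product((0, 1), repeat=2))


def steady_bias(tables, weights, p0=0.3, n_iter=5000):
    """Iterate the bias map; assumes it settles on a fixed point."""
    p = p0
    for _ in range(n_iter):
        p = bias_map(p, tables, weights)
    return p


def sensitivity(p, tables, weights):
    """Bias-weighted sensitivity of a mixture of two-input functions."""
    lam = 0.0
    for t, w in zip(tables, weights):
        for x in (0, 1):  # value of the input that is held fixed
            px = p if x else 1 - p
            lam += w * px * (t[x] ^ t[2 + x])          # flip the first input
            lam += w * px * (t[2 * x] ^ t[2 * x + 1])  # flip the second input
    return lam
```

For an equal mixture of AND and IF nodes, iterating the map from `p0 = 0.3` settles at p = 0.5, where the sensitivity evaluates to 1, the critical value.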

For the present study, we fix the number of inputs and 
outputs of the nodes such that the light cones of all nodes 
have the same shape. We study the simplest nontrivial 
case, in which all nodes have exactly two inputs and two 
outputs. For these networks, the greatest possible sensitivity 
is $\lambda = 2$. 

Ref. [1] outlined an algorithm for computationally distinguishing 
the causal states of cellular automata, and 
we implement similar procedures to distinguish the 
causal states of our RBNs. Because nodes in RBNs are 
not assigned spatial positions, the definition of $L^-$ and 
$L^+$ requires some technical modifications. In the calculation 
of $C_\mu$, $L^-(i,t)$ of node $i$ at time step $t$ is defined 
as 

$$L^-_\mu(i,t) = \{ (X(j,u),\, t-u,\, d(i,j)) \;\; \forall\, u < t \text{ and } d(i,j) \le t-u \}, \qquad (10)$$

where d( i, j) is the shortest distance between nodes i and 
j. By this definition, different nodes that have the same 
distance from a given node are deemed indistinguishable 
for the purpose of identifying a light cone configuration. 
This is because there is no meaningful way to index the 
inputs and outputs of nodes for the purpose of comparing 
light cone configurations of different nodes. Although 
the update rules (Eq. (7)) contain indices for node inputs, 
the calculation of statistical complexity is done without 
the knowledge of the underlying physical laws and hence 
without the knowledge of which input is which at any 
given node. 

For calculating the complexity $C_\nu(i)$ of an individual 
node, however, different inputs and outputs are distinguishable. 
When comparing two past or future light cone 
configurations of the same node $i$ at different time steps, 
one can keep track of which input is which and observe 
that when the two inputs differ, the future light cone 
configuration depends upon which input value is 1. Consequently, 
$L^-(i,t)$ is defined as 

$$L^-_\nu(i,t) = \{ (X(j,u),\, t-u,\, j) \;\; \forall\, u < t \text{ and } d(i,j) \le t-u \}. \qquad (11)$$
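
At light cone depth 1 the difference between the two definitions reduces to whether the input values are labeled. The sketch below illustrates this; the function names `past_mu` and `past_nu` are hypothetical, and the time-lapse and distance entries are dropped because they are the same for every element at depth 1.

```python
def past_mu(state, inputs_of_i):
    """Depth-1 past for C_mu: input values with their labels discarded."""
    return tuple(sorted(int(state[j]) for j in inputs_of_i))


def past_nu(state, inputs_of_i):
    """Depth-1 past for C_nu: each input value tagged by the input's index."""
    return tuple((int(j), int(state[j])) for j in inputs_of_i)
```

For a node whose two inputs take the values 0 and 1, `past_mu` returns the same configuration whichever input is ON, whereas `past_nu` distinguishes the two cases.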

Following Ref. [1] we limit the depth of the light cones to 1. 
There are two features of an RBN that ensure that light 
cones of depth 1 yield the same complexity values as light 
cones of any other depth. First, there is no memory of 
previous states in the RBN rules: the configuration at $t = t'$ 
depends only on what happens at $t = t' - 1$. Second, a 
node's past light cone typically only influences its future 
light cone through the node itself. Fig. 1 illustrates the 
crucial difference between an RBN and a regular lattice. In 
the latter, a node's past light cone can influence its future 
light cone through multiple paths of the same length as 
the path passing through the node of interest, whereas in 
an RBN, the chance of finding such a path is vanishingly 
small in the limit of large system size. For these reasons, 
even though the definition of statistical complexity relies 
on arbitrarily deep light cones, the results for depth 1 
become exact for RBNs with system size $N \to \infty$. This 
fact enables us to calculate $C_\mu$ and, in some cases, $C_\nu$ 
analytically in the large system limit. 

$C_\mu$ is calculated from depth-1 light cones as follows. 
For a node with two inputs $i$ and $j$, there are only three 
possible past light cone configurations: {0, 0}, {1, 1}, and 
{1, 0} or {0, 1} (note that we have omitted the relative time 
and distance entries in writing the light cone configurations 
because they are trivial once we restrict the 
light cone depth to 1), occurring with probability $(1-p)^2$, 
$p^2$, and $2p(1-p)$, respectively, and yielding probabilities 
for the reference node being ON of $\langle f(0,0) \rangle$, $\langle f(1,1) \rangle$, 
and $\tfrac{1}{2}(\langle f(0,1) \rangle + \langle f(1,0) \rangle)$, respectively. As discussed 
above, the probability of observing a future light cone 
configuration, $L^+$, depends only on the state of the reference 
node. Thus, if each of the three possible $L^-$'s 
yields a unique probability for the reference node to be 
ON or OFF, which in turn yields a unique distribution 
of $L^+$'s, then each $L^-$ is itself a causal state and we have 

$$C_\mu = -p^2 \log_2(p^2) - (1-p)^2 \log_2((1-p)^2) - 2p(1-p) \log_2(2p(1-p)). \qquad (12)$$

Modifications to Eq. (12) are required if the above assumptions 
do not hold. For example, if any two of 
$\langle f(0,0) \rangle$, $\tfrac{1}{2}(\langle f(0,1) \rangle + \langle f(1,0) \rangle)$ and $\langle f(1,1) \rangle$ are equal, 
then there would be fewer than three causal states. The 
following scenario is also possible. If the state of a reference 
node $x$ is 1, the probability that an output is ON 
is 

$$p_1^+ = p \langle f(1,1) \rangle + \tfrac{1}{2}(1-p) \langle f(1,0) \rangle + \tfrac{1}{2}(1-p) \langle f(0,1) \rangle; \qquad (13)$$

and if $x = 0$, the probability that an output is ON is 

$$p_0^+ = (1-p) \langle f(0,0) \rangle + \tfrac{1}{2} p \langle f(1,0) \rangle + \tfrac{1}{2} p \langle f(0,1) \rangle. \qquad (14)$$

In the case of an accidental degeneracy $p_1^+ = p_0^+$, the 
distribution of $L^+$ is independent of $x$ and therefore independent 
of $L^-$, in which case all three possible $L^-$ 
collapse to a single causal state, yielding $C_\mu = 0$. 
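
The depth-1 calculation, including the two kinds of degeneracy just described, can be sketched as follows. The function name `c_mu`, its argument order, and the numerical tolerance used to decide when two probabilities coincide are assumptions made for illustration.

```python
from math import isclose, log2


def c_mu(p, f00, f01, f10, f11, tol=1e-12):
    """Depth-1 C_mu in bits from the bias p and the averaged truth table <f>."""
    # (probability of the past configuration, probability the node is then ON)
    pasts = [((1 - p) ** 2, f00),
             (2 * p * (1 - p), 0.5 * (f01 + f10)),
             (p ** 2, f11)]
    # Probability that an output is ON given the reference node is 1 or 0.
    p1_plus = p * f11 + 0.5 * (1 - p) * (f10 + f01)
    p0_plus = (1 - p) * f00 + 0.5 * p * (f10 + f01)
    if isclose(p1_plus, p0_plus, abs_tol=tol):
        return 0.0  # futures are independent of the past: one causal state
    merged = []  # each entry: [ON probability, total weight]
    for weight, on_prob in pasts:
        for state in merged:
            if isclose(state[0], on_prob, abs_tol=tol):
                state[1] += weight  # same ON probability: same causal state
                break
        else:
            merged.append([on_prob, weight])
    return -sum(w * log2(w) for _, w in merged if w > 0)
```

For an equal mixture of AND (0, 0, 0, 1) and IF (1, 0, 1, 1) nodes at bias p = 0.5, the averaged truth table is (0.5, 0, 0.5, 1), the three pasts are distinct causal states, and the result is 1.5 bits. For a pure XOR network at p = 0.5 the degeneracy $p_1^+ = p_0^+$ applies and the result is 0.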

FIG. 1. A regular lattice network and a random network 

Fig. 2 shows the comparison between the analytical 
calculation of $C_\mu$ and simulation results for ensembles 
of networks with mixtures of two logic functions. The 
horizontal axis denotes the fraction of nodes that get 
assigned the indicated Boolean function. The ensemble 
in Fig. 2(a) satisfies the requirements for Eq. (12). 
The ensemble in Fig. 2(b) only has two causal states 
because $\langle f(0,0) \rangle = \langle f(1,1) \rangle = 1 - q$, and 
$\tfrac{1}{2}(\langle f(0,1) \rangle + \langle f(1,0) \rangle) = 1$, where $q$ is the fraction of XOR nodes in 
a XOR-ON network. The first two terms in Eq. (12) collapse 
into one for such an ensemble, and the corresponding 
$C_\mu$ is given by 

$$C_\mu = -(p^2 + (1-p)^2) \log_2(p^2 + (1-p)^2) - 2p(1-p) \log_2(2p(1-p)). \qquad (15)$$

From Eq. (8) we can obtain the fixed-state bias for a XOR-ON 
network to be 

$$p = \frac{-1 + 2q + \sqrt{1 + 4q - 4q^2}}{4q}, \qquad (16)$$

which is monotonically decreasing from $p = 1$ to $p = 0.5$ 
for $q \in [0,1]$. We could subsequently obtain a complicated 
expression for $C_\mu$ in terms of $q$, which we omit 
here, but we can see that $C_\mu(q)$ is an increasing function 
of $q \in [0,1]$ because $C_\mu(p)$ monotonically decreases for 
$p \in [0.5, 1]$. Simulation results indicate that $C_\mu$ does not 
change noticeably for a given ensemble for network sizes 
above $N = 1000$. The simulations for $N = 10^4$ agree 
well with the large system analytical result. The disagreements 
at $q = 0.05$ and $q = 1$ in Fig. 2(b) are due, 
respectively, to the fact that differences between causal 
states are too small for the simulations to resolve ($1 - q$ 
too close to 1) and to a collapse of the type described in 
the previous paragraph ($p_1^+ = p_0^+$). For some choices of 
Boolean functions, the bias $p$ can oscillate, which leads 
to persistent oscillations in $C_\mu$. 
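
The XOR-ON expressions can be checked numerically. In the sketch below the function names are hypothetical; `xor_on_bias` evaluates the closed-form bias of Eq. (16) and `xor_on_c_mu` evaluates the two-state entropy of Eq. (15) at that bias. The collapse to a single causal state at q = 1 is deliberately not included, so the value returned there is the two-state value rather than zero.

```python
from math import log2, sqrt


def xor_on_bias(q):
    """Fixed-point bias for a fraction q > 0 of XOR nodes (the rest are ON)."""
    return (-1 + 2 * q + sqrt(1 + 4 * q - 4 * q * q)) / (4 * q)


def xor_on_c_mu(q):
    """Two-state C_mu in bits, evaluated at the fixed-point bias."""
    p = xor_on_bias(q)
    same = p * p + (1 - p) ** 2  # inputs equal: node is ON with prob. 1 - q
    diff = 2 * p * (1 - p)       # inputs differ: node is always ON
    return -sum(w * log2(w) for w in (same, diff) if w > 0)
```

Substituting the returned bias into the XOR-ON bias map, p = (1 - q) + 2qp(1 - p), reproduces it to rounding error, and evaluating `xor_on_c_mu` on a grid of q values shows the monotonic increase described above.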

Critical networks (with $\lambda = 1$) have been hypothesized 
to have properties that might be favored by natural selection 
or other self-organized processes [5]. Our findings 
show that $C_\mu$ is not simply related to sensitivity, so that 
maximization of $C_\mu$ does not correspond to selection of 
critical networks. Fig. 2 illustrates this point with two 
examples. In (a), we see that the network is critical at 
both $q = 0$ and $q = 0.5$ and that $C_\mu$ is a maximum in 
one case but zero in the other. In (b), we see that $C_\mu$ 
increases monotonically as $\lambda$ increases from 1 (the critical 
value) to 2. 

The fact that $C_\mu$ can be high in ordered systems ($\lambda < 1$), 
where the attractor dynamics is trivial, highlights the 
fact that the spatial inhomogeneity of states alone can 
produce a high complexity. In these networks, almost 
all nodes are frozen on a fixed value, independent of the 
initial conditions, but different nodes may be frozen on 
different values and $C_\mu$ becomes a measure of the degree 
of variation from node to node. The fact that $C_\mu$ can be 
high in strongly disordered systems ($\lambda$ near the maximum 
possible value of 2) reflects the tendency of the bias $p$ to 
approach values that maximize $C_\mu$ at these points. 

The calculation of $C_\nu$ treats each component as an 
agent with its own causal states and complexity, then 
averages those complexity values. When causal states 
are determined by considering the past and future light 
cones of a single node at different times, every frozen node 
has zero complexity, as does any node for which the past 
and future light cones are uncorrelated. In ordered or 
critical networks, where only a vanishingly small number 
of nodes are not frozen, $C_\nu$ is very close to zero. In 
highly disordered systems, the behavior of most nodes 
closely approximates a purely stochastic process, so $C_\nu$ 
is again near zero. Thus $C_\nu$ is maximized somewhere in 
the disordered regime. 

Fig. 3 shows a typical plot of complexity $C_\nu$ as a function 
of sensitivity $\lambda$. Recall that $\lambda = 2$ is the highest 
sensitivity value possible for a system in which each node 
has exactly two inputs and two outputs. All ensembles of 
systems with a full range of sensitivity values exhibit the 
same general relation between $C_\nu$ and $\lambda$, with $C_\nu$ maximized 
at different $\lambda$ from ensemble to ensemble. 

We have not found a way to calculate $C_\nu$ analytically 
for networks with a general combination of logic functions, 
but we can do it (in the large system limit) for 
the special case of the networks represented in Fig. 2(b), 
which contain only the logic functions XOR (0110) and 
ON (1111). In this case, the complexity of each node is 
either 0 or 1, depending on whether it and/or its neighbors 
are frozen or not. 

FIG. 2. The steady-state bias $p$, sensitivity $\lambda$ and complexity $C_\mu$ for networks consisting of two types of nodes. (a) A fraction $q$ 
of the nodes are assigned the AND function $(f(0,0), f(0,1), f(1,0), f(1,1)) = (0, 0, 0, 1)$ while the others are assigned (1, 0, 1, 1) 
(IF). (b) A fraction $q$ are assigned (0, 1, 1, 0) (XOR) and the rest (1, 1, 1, 1) (always ON). The solid curve and circular points 
are respectively the analytical and simulation results for $C_\mu$. The size of networks is $N = 10^4$, and time of data collection for 
each network realization is T = 10®. For each of the 21 ratios of the two logic functions (21 dots in the graph), 30 RBNs are 
constructed. For each RBN, 30 runs are simulated with different initial conditions. 

FIG. 3. A typical pattern for complexity $C_\nu$ vs sensitivity $\lambda$. 
$C_\nu$ is close to zero for $\lambda \le 1$ and for the maximum value of 
$\lambda = 2$. $C_\nu$ is maximized near $\lambda = 1.5$. The two logic functions 
are (0, 1, 1, 0) and (1, 1, 1, 1), as in Fig. 2(b). The solid curve 
shows theoretical results and the circular dots show simulation 
results. The size of networks is $N = 10^4$ and time of data 
collection for each network realization is T = 10®. For each 
of the 21 ratios of the two logic functions (21 dots in the 
graph), 30 RBNs are constructed. For each RBN, 30 runs are 
simulated with different initial conditions. 



The first step in calculating $C_\nu$ is to find the fraction 
$\gamma$ of nodes that are frozen. In general, for networks of 
two-input logic functions in which the probability that 
a node is frozen when a subset of its inputs are frozen 
is independent of the value of those frozen inputs, $\gamma$ is 
given by a solution of the equation 

$$p_0 (1-x)^2 + 2 p_1 x (1-x) + p_2 x^2 = x, \qquad (17)$$

where $p_k$ is the probability that a node will be frozen if exactly 
$k$ of its inputs are frozen. (See [14], Eq. (70).) The 
XOR-ON networks satisfy the requirements and have 
$p_0 = p_1 = 1 - q$ and $p_2 = 1$, where $q$ is the fraction 
of XOR nodes, which yields 

$$\gamma = (1 - q)/q. \qquad (18)$$

Eq. (18) yields values that are greater than 1 for $q < 1/2$. 
In that case the relevant solution of Eq. (17) is its other 
root, $x = 1$, which means that the fraction of frozen nodes 
approaches 1 when the network size is large. 
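
The frozen fraction can be obtained by solving the quadratic directly. The sketch below is illustrative: the function name `frozen_fraction` is hypothetical, and choosing the smallest root in [0, 1] is an assumption that reproduces the two XOR-ON cases described above.

```python
from math import sqrt


def frozen_fraction(p0, p1, p2, tol=1e-12):
    """Smallest root in [0, 1] of p0(1-x)^2 + 2 p1 x(1-x) + p2 x^2 = x."""
    # Collect powers of x: a x^2 + b x + c = 0.
    a = p0 - 2 * p1 + p2
    b = 2 * (p1 - p0) - 1
    c = p0
    if abs(a) < tol:
        roots = [-c / b] if abs(b) > tol else []
    else:
        disc = max(b * b - 4 * a * c, 0.0)
        roots = [(-b - sqrt(disc)) / (2 * a), (-b + sqrt(disc)) / (2 * a)]
    # Keep roots that are valid fractions, clipping rounding error at the ends.
    valid = [min(max(r, 0.0), 1.0) for r in roots if -tol <= r <= 1 + tol]
    return min(valid)
```

With `p0 = p1 = 1 - q` and `p2 = 1`, the function returns (1 - q)/q for q > 1/2 and 1 for q < 1/2.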

The unfrozen nodes are all XORs, some of which have 
one frozen input and therefore act either as a copier or 
inverter of the other input. The network of unfrozen 
nodes thus acts as a network in which every node executes 
a parity function (or its inversion). Our calculation 
of $C_\nu$ relies on the following conjecture: all nodes 
in networks consisting of only parity functions (for arbitrary 
numbers of inputs to each node) will have bias 0.5 
when averaged over initial conditions. We further conjecture 
that the bias is not affected by being embedded in a 
larger network in which all other nodes freeze after some 
transient. Our conjectures are supported by numerical 
simulations of 10,000 randomly generated 16-node networks 
with only XOR and ON gates. We find that the 
bias of 0.5 on each unfrozen node is maintained on each 
individual time step, not just after averaging over time. 
In many of these networks, it can be seen that every state 
of the unfrozen portion has exactly one pre-image, which 
is enough to guarantee that all states occur with equal 
frequency when averaged over initial conditions. In cases 
where some states have two pre-images, the number of 
recurrent states decreases by a factor of two (and may in 
some cases be reduced by additional factors of 2). Simulations 
clearly show that the bias of each unfrozen node 
averaged over initial conditions is still 0.5 at every time 
step, but we have not found a rigorous proof that this 
must be true. 

In the large system limit, all frozen nodes have complexity 
$C_\nu(i) = 0$. To compute $C_\nu$, we must now determine 
the complexities of the unfrozen nodes. An unfrozen 
node in the XOR-ON network can have complexity 0 or 
1. $C_\nu(i) = 0$ occurs when an unfrozen XOR node $i$ has 
outputs only to unfrozen nodes that all have additional 
unfrozen inputs. The value of each output node in this 
case is equally likely to take either value, no matter what 
the value of node $i$. The only way for an unfrozen node 
$i$ to have nonzero complexity is to have at least one output 
to a node whose other input is frozen. The future 
light cone distribution then depends on $X_i$, and the past 
light cones that yield the two values occur with equal 
probability, which gives $C_\nu(i) = 1$. 

Calculating $C_\nu$ for the XOR-ON ensemble of networks 
comes down to determining the fraction of nodes that 
satisfy the condition for having $C_\nu(i) = 1$, which can be 
obtained from a mean-field calculation as follows. Let $X_1$ 
and $X_2$ be inputs of node $X_3$ and assume that $X_1$ has 
no other output. If we assume that $X_1$ is an unfrozen 
node, then the probability that $X_2$ is frozen and $X_3$ is 
unfrozen is $q\gamma$. Because the only way for $X_1$ to have 
complexity 1 is for $X_2$ to be frozen, the probability that 
$X_1$ has complexity 1 is $q\gamma$. For an unfrozen node with 
two outputs, the complexity will be 1 if either of the 
outputs has a frozen input, which occurs with probability 
$1 - (1 - q\gamma)^2$. Thus $C_\nu$ is equal to the total fraction 
of nodes with nonzero complexity: 

$$C_\nu = (1 - \gamma)\,(1 - (1 - q\gamma)^2). \qquad (19)$$

Using Eq. (18) we have the system average 

$$C_\nu = \frac{1}{q}\,(1 - q^2)(2q - 1). \qquad (20)$$

Eq. (20) agrees well with simulation results (Fig. 3). 

The XOR-ON example shows that $C_\nu$ can be calculated 
for specific distributions of logic functions and, 
more importantly, illustrates that the complexity of an 
individual component depends on globally determined 
dynamics, not just on the logical process carried out by 
the individual node or on the smaller network motifs containing 
the node. The pattern of frozen nodes is generated 
by a transient process that may propagate through 
the entire network [14]. 

The precise import of the value of $C_\nu(i)$ at a given 
node is not immediately clear. We have measured the 
correlation between $C_\nu(i)$ and a new measure $\delta(i)$ that 
characterizes the effect of replacing node $i$ with a random 
number generator. $\delta(i)$ is determined as follows. Two 
realizations of the same network are run in parallel with 
the same initial condition. After running long enough 
for transients to decay, the logic function $f_i$ is ignored in 
one of the copies and node $i$ is replaced by a stochastic 
agent generating $X_i = 1$ or 0 with probabilities $p$ and 
$1 - p$, respectively, on each time step. Over the course 
of $M$ steps, we calculate the fraction of time that each 
node in the network is ON for each of the realizations, 
forming two $N$-dimensional vectors. The Euclidean distance 
between these two vectors is denoted $\delta(i,p)$, and 
we define $\delta(i) = \min\{\delta(i,p) : p \in [0,1]\}$. We obtain an 
approximate measure of $\delta(i)$ by taking $M = 10N$ and 
considering $p = 0.1n$ for $n = 0, 1, \ldots, 10$. A study of 
the XOR-AND and IF-AND ensembles shows that the 
correlation coefficient between $C_\nu(i)$ and $\delta(i)$ in the disordered 
regime ranges between 0.4 and 0.7 for network size 
$N = 1000$. This suggests an interpretation of $C_\nu(i)$: it is 
a measure of the importance of the logical processing performed 
at node $i$ for determining the global dynamics. A 
low value of $C_\nu(i)$ means that the computations done by 
node $i$ can be effectively simulated by a random number 
generator. 

We note that correlations associated with the existence 
of two paths in the network from one node to another can 
cause frozen nodes to have nonzero $C_\nu(i)$. The correlation 
arises when a path of unfrozen nodes from some node 
$j$ passes through the frozen node $i$ and then immediately 
to an unfrozen node $k$, while another path of unfrozen 
nodes of the same length goes from $j$ to $k$ without passing 
through $i$. Let $j_1$ be the input to $i$ along the first 
path. Then $X_{j_1}$ and $X_k$ may be correlated due to the 
common source at node $j$, in spite of the fact that node 
$i$ does not pass on any information. (See Fig. 4.) In the 
large system limit, the fraction of nodes affected by this 
type of correlation vanishes. 
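
The mean-field result for the XOR-ON ensemble can be evaluated in a few lines. The function name `c_nu_xor_on` is hypothetical, and capping the frozen fraction at 1 for q <= 1/2 follows the remark after Eq. (18).

```python
def c_nu_xor_on(q):
    """Mean-field C_nu for XOR-ON networks; q > 0 is the fraction of XOR nodes."""
    gamma = min(1.0, (1 - q) / q)  # frozen fraction, capped at 1
    # Unfrozen fraction times the probability that at least one of the
    # two outputs feeds a node whose other input is frozen.
    return (1 - gamma) * (1 - (1 - q * gamma) ** 2)
```

For q > 1/2 this agrees with the closed form (1 - q^2)(2q - 1)/q, it vanishes at q = 1/2 and q = 1, and on a grid of q values its maximum lies between q = 0.70 and q = 0.75. If the sensitivity of this ensemble is 2q, as it is when XOR nodes have sensitivity 2 and ON nodes have sensitivity 0, the maximum falls between 1.4 and 1.5, in line with the statement in the Fig. 3 caption that the maximum is near a sensitivity of 1.5.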




FIG. 4. An example of a network structure that can cause a 
frozen node to have nonzero $C_\nu(i)$: node $i$ is frozen, but its 
past and future light cones are correlated because nodes $j_1$ 
and $k$ receive input from the same node $j$ through chains of 
unfrozen nodes that differ in length by exactly two links. 
Because $\delta(i)$ is necessarily zero for any frozen node, 
the existence of multiple paths in a finite system causes 
some nodes with high $C_\nu(i)$ to have low $\delta(i)$. We further 
note that $\delta(i)$ itself is not a perfect measure of the dynamical 
importance of a given node's activity because the average 
Euclidean distance between the bias vectors may be 
small even though the sequence of states is substantially 
different. In fact, we observe that $\delta(i)$ saturates at low 
$C_\nu(i)$ values. Comparing $C_\nu(i)$ with more precise measures 
of dynamical importance would be an interesting 
topic for future research. 
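
The replacement measure described in this section can be sketched as follows for two-input networks. This is an illustrative implementation under stated assumptions, not the code used for the reported results: the function names are hypothetical, truth tables are stored as rows (f(0,0), f(0,1), f(1,0), f(1,1)), the supplied state is assumed to be taken after transients have decayed, and a single stochastic run is used for each value of p.

```python
import numpy as np


def on_fractions(state, inputs, tables, steps, rng=None, node=None, p=None):
    """Fraction of time each node is ON over `steps` synchronous updates.

    If `node` is given, its logic function is ignored and its value is
    drawn as 1 with probability p on every step.
    """
    state = state.copy()
    total = np.zeros(len(state))
    idx = np.arange(len(state))
    for _ in range(steps):
        state = tables[idx, 2 * state[inputs[:, 0]] + state[inputs[:, 1]]]
        if node is not None:
            state[node] = rng.random() < p
        total += state
    return total / steps


def delta(i, state, inputs, tables, rng,
          p_grid=tuple(k / 10 for k in range(11))):
    """Minimum over p of the distance between the two ON-fraction vectors."""
    steps = 10 * len(state)  # M = 10 N
    reference = on_fractions(state, inputs, tables, steps)
    return min(
        np.linalg.norm(
            reference - on_fractions(state, inputs, tables, steps, rng, i, p))
        for p in p_grid)
```

In a network of always-ON nodes, replacing any node by a generator with p = 1 leaves every time series unchanged, so `delta` returns zero, as expected for a frozen node.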



IV. CONCLUSION 

In extending the formalism of Shalizi's complexity 
measure to random Boolean networks, a distinction must 
be made between determining causal states by averaging 
over nodes at a given time step versus averaging over time 
for each node separately. The two methods are equivalent 
for systems described by the same input-output function 
at every node of a uniform lattice, but yield different re- 
sults for systems that are spatially inhomogeneous either 
because the input-output functions are different for dif- 
ferent nodes or because the network topology is not a 
regular lattice. The networks we have studied have both 
types of inhomogeneity. 

We find that $C_\mu$, obtained from statistics at a single 
time step, can be calculated analytically for RBNs, and 
that it is not simply related to sensitivity to small perturbations. 
The maximum of $C_\mu$ can occur in the ordered, 
disordered, or critical regime, depending on the details of 
the probability distribution chosen for the Boolean rules 
in the network. $C_\mu$ is directly related to the steady state 
(or long-term oscillatory) bias in the node values. 

$C_\nu$, obtained from statistics compiled over time for individual 
nodes, is zero (in the large system limit) everywhere 
in the ordered regime and at the limiting value of sensitivity 
in the disordered regime, so it is maximized somewhere 
in the disordered regime. The value at an individual 
node, $C_\nu(i)$, may be interpreted as a measure of 
the role that node plays in determining the global dynamics. 
Nodes that can be replaced by random number generators 
without substantially altering the global dynamics 
(and hence perform no essential information processing) 
tend to have lower $C_\nu(i)$. This last finding may be helpful 
in characterizing the behavior of social or biological 
systems and the individual agents or components that 
comprise them. Interestingly, the $C_\nu(i)$'s depend on the 
global features of the network that determine its attractors, 
not just on the local logic functions and topology. 
In other words, the identification of important players in 
a network requires global information about the network, 
not just a characterization of each individual's behavior. 